Methods and systems for providing and playing videos having multiple tracks of timed text over a network

ABSTRACT

The present invention relates to video provided over one or more networks. Methods and systems for providing, playing, and/or editing video having multiple tracks of timed text are provided in different embodiments of the present invention.

BACKGROUND OF THE INVENTION

1. Field of the Invention

The present invention relates to video provided over a network.

2. Background Art

Video is increasingly being accessed by remote users over networks, such as the Internet. The rise of the World Wide Web, including various web applications, protocols, and related networking and computing technologies, has made it possible for remote users to view and play video. Video services that allow users to search different videos and select videos through a browser have become increasingly popular.

Video content often includes an audio component, such as speech, music, and other sound. Timed text (TT), such as captions or subtitles, is sometimes provided with video content. Such timed text can be helpful to those who are deaf or hard of hearing, to those who are in environments where it is difficult or not permitted to hear audio, or to those for whom the audio is not in their native language.

In broadcast video or video professionally produced and distributed on DVD or other formats, sophisticated techniques have been used by video producers or professional caption companies to add captions in one or more languages. These techniques involve embedding or adding captions at the time a video is created, prior to distribution. At playback, a user is limited to the captions that are present on the DVD.

Unlike broadcast video, online video is often produced by a wide range of sources and people. This can include a person with a video camera having no captioning capability or skill. Accordingly, much of the online video content available today does not include timed text. Adding timed text requires the services of an expensive professional captioning service and essentially amounts to redistributing the video with timed text. This is expensive, slow, and impractical for many online videos. Even in cases where an online video is produced and distributed with a track of timed text, it is often only provided by the video producer in one language, which may not suit a large number of remote users having different native languages. Current online video players and services do not customarily provide for the display of multiple tracks of timed text.

What are needed are new systems and methods for providing, playing, and/or editing online video that can accommodate multiple tracks of timed text.

BRIEF SUMMARY OF THE INVENTION

The present invention relates to video provided over one or more networks. Methods and systems for providing, playing, and/or editing video having multiple tracks of timed text are provided in different embodiments of the present invention.

Further embodiments, features, and advantages of the invention, as well as the structure and operation of the various embodiments of the invention, are described in detail below with reference to the accompanying drawings.

BRIEF DESCRIPTION OF THE DRAWINGS

The present invention will be described with reference to the accompanying drawings, wherein like reference numbers indicate identical or functionally similar elements. Also, the leftmost digit(s) of the reference numbers identify the drawings in which the associated elements are first introduced.

FIG. 1A is a diagram of a system for providing and playing multiple tracks of timed text according to an embodiment of the present invention.

FIG. 1B is a diagram of the system of FIG. 1A further including a system for editing online video having multiple tracks of timed text according to another embodiment of the present invention.

FIG. 2 is a flow diagram of a method for providing videos having multiple tracks of timed text over a network according to an embodiment of the present invention.

FIG. 3 is a flow diagram of a method for playing videos having multiple tracks of timed text received over a network according to an embodiment of the present invention.

FIG. 4 is a diagram showing an example operation of providing and playing videos having multiple tracks of timed text over a network according to an embodiment of the present invention.

FIG. 5 is a screen capture of an example window for playing online video having multiple tracks of timed text according to an embodiment of the present invention.

FIG. 6 is a flow diagram of a method for editing videos having multiple tracks of timed text over a network according to an embodiment of the present invention.

FIGS. 7A to 7C are screen captures showing an example user-interface panel for editing multiple timed text tracks in a status window according to an embodiment of the present invention.

FIGS. 8A and 8B are screen captures showing an example user-interface panel for managing a set of video files available for editing of associated timed text tracks according to an embodiment of the present invention.

DETAILED DESCRIPTION OF EMBODIMENTS OF THE INVENTION

While the present invention is described herein with reference to illustrative embodiments for particular applications, it should be understood that the invention is not limited thereto. Those skilled in the relevant art(s) with access to the teachings provided herein will recognize additional modifications, applications, and embodiments within the scope thereof and additional fields in which the invention would be of significant utility.

The present invention relates to video provided over one or more networks. Methods and systems for providing, playing, and/or editing video having multiple tracks of timed text (TT) are provided in different embodiments of the present invention.

The term “timed text” refers to textual information that is intrinsically or extrinsically associated with timing information. Examples of timed text can include, but are not limited to, captions or subtitles. A “track” of timed text refers to a composition of timed text data intended to be used in a period of video playback.
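
By way of non-limiting illustration only, the two definitions above can be sketched as a small data model. The names `Cue`, `Track`, and `text_at` are hypothetical and do not appear in the drawings; the sketch also assumes that timing is expressed in seconds and that entries within a track do not overlap in time.

```python
from bisect import bisect_right
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(frozen=True)
class Cue:
    """One piece of timed text: text plus the interval in which it is shown."""
    start: float  # seconds from the start of playback (assumed unit)
    end: float    # seconds from the start of playback (assumed unit)
    text: str


@dataclass
class Track:
    """A composition of timed text intended for one period of video playback."""
    name: str
    language: str
    cues: List[Cue] = field(default_factory=list)

    def text_at(self, t: float) -> Optional[str]:
        """Return the text active at playback time t, or None if there is none.

        Assumes the cues of a track do not overlap in time.
        """
        ordered = sorted(self.cues, key=lambda c: c.start)
        starts = [c.start for c in ordered]
        i = bisect_right(starts, t) - 1  # last cue starting at or before t
        if i >= 0 and t < ordered[i].end:
            return ordered[i].text
        return None
```

A player built along these lines would call `text_at` with the current playback time each time the display is refreshed.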

Providing and Playing Video Having Multiple Tracks of TT Over a Network

FIG. 1A shows a system 100A for providing and playing multiple tracks of timed text according to an embodiment of the present invention. System 100A includes a client 130 and server 160. Client 130 can be coupled to server 160 over a network 120. One or more databases 180 can be coupled to server 160. Network 120 can be one or more computer and/or telephony networks, or combinations of networks. Network 120 can be a local area network (LAN), medium area network, or wide area network, such as the Internet. One client 130 and one server 160 are shown for clarity; in practice, many remote clients 130 can be coupled to one or more servers 160.

Client 130 can include a browser 140 and video player 150. In one example, video player 150 can be part of or embedded with browser 140. In another example, video player 150 can be separate but coupled to communicate with browser 140. Video player 150 can be a custom player, or can be used in combination with a known FLASH player or other type of video player, or can be a modification of a known FLASH player or other type of video player. Server 160 can further include or be coupled to a web server (not shown) to support web protocols and communication with remote browser 140.

According to a feature, video player 150 and server 160 can communicate to allow video player 150 to play video having multiple tracks of timed text over network 120 to a remote user at client 130. The operation of server 160 and video player 150 and other components of system 100A are further described below with respect to methods and examples in FIGS. 2-5. For exemplary purposes, server 160 is described as a single server. However, system 100A is not limited to this implementation, and server 160 can be implemented in any number of servers.

FIG. 2 is a flow diagram of a method 200 for providing videos having multiple tracks of timed text over a network according to an embodiment of the present invention (steps 210-230). In step 210, video data and multiple tracks of timed text associated with respective video data are stored in database 180. Video data may be stored along with the multiple tracks of TT in the same database 180 or on different databases at the same or different locations. Any type of database (including but not limited to a relational database, or other data structure or service) may be used to store video data and/or associated multiple tracks of timed text. Further, database 180 can be stored on one or more storage devices. These storage devices can be locally or remotely coupled to one another and to server 160. As an alternative to the storing in step 210, video data may be generated dynamically or streamed from a video streaming source. Video data may be provided in any suitable video format including, but not limited to, video formats associated with video content incorporated in files or streams.

In step 220, server 160 processes requests for video data with multiple tracks of TT. These requests can be received over network 120 from client 130. Server 160 retrieves multiple tracks of TT, and returns multiple tracks of TT to video player 150 for viewing by the remote user. An initial track of timed text may also be sent.

The amount of data sent regarding the multiple tracks of TT can vary in different embodiments depending upon how much data is desired to be sent, the available bandwidth, storage capacity at client 130, or other design preference or need. In one example, a track list having metadata, such as track name and language for all the multiple tracks, but no timed text is provided to the video player 150. Timed text is then provided when specifically requested by video player 150. (An example operation of server 160 and video player 150 with a track list is described further below with respect to FIG. 4.)

In another example, a track list having metadata, such as track name and language for all the multiple tracks, and timed text for an initial track (or set of initial tracks) is provided to the video player 150. This initial track (or set of initial tracks) can be identified by the server 160 (or by video player 150) based on user preference, language preference, a default value, or other criteria. Timed text for different tracks is then provided when specifically requested by video player 150 as described below with respect to step 230. In another example, a track list can be provided having metadata, such as track name and language for multiple tracks, along with timed text for all tracks. In this example, client 130 receives timed text for multiple tracks more quickly but may store more timed text data than needed by a particular user.
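
As a non-limiting sketch, the three track-list variants described above (metadata only, metadata plus timed text for an initial track, and metadata plus timed text for all tracks) might be assembled as follows. The `TRACKS` dictionary is a hypothetical in-memory stand-in for database 180, and the identifiers, field names, and the language-preference rule in `choose_initial` are assumptions made for illustration.

```python
from typing import Dict, List, Optional, Sequence

# Hypothetical stand-in for database 180: track id -> metadata and timed text.
TRACKS: Dict[str, dict] = {
    "t1": {"name": "demo-1", "language": "English", "timed_text": "0:00 Hello"},
    "t2": {"name": "demo-2", "language": "German", "timed_text": "0:00 Hallo"},
    "t3": {"name": "traditional", "language": "Chinese", "timed_text": "0:00 Ni hao"},
}

MODES = ("metadata_only", "with_initial", "all")


def choose_initial(preferred_languages: Sequence[str], default_id: str) -> str:
    """Pick an initial track by language preference, else fall back to a default."""
    for language in preferred_languages:
        for track_id, track in TRACKS.items():
            if track["language"] == language:
                return track_id
    return default_id


def build_track_list(mode: str, initial_id: Optional[str] = None) -> List[dict]:
    """Build a track list carrying as much timed text as the chosen mode allows."""
    if mode not in MODES:
        raise ValueError(f"unknown mode: {mode}")
    entries = []
    for track_id, track in TRACKS.items():
        entry = {"id": track_id, "name": track["name"], "language": track["language"]}
        wanted = mode == "all" or (mode == "with_initial" and track_id == initial_id)
        if wanted:
            entry["timed_text"] = track["timed_text"]
        entries.append(entry)
    return entries
```

The `"all"` mode trades a larger response for faster track switching at the client, which mirrors the trade-off noted in the preceding paragraph.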

The metadata on multiple tracks above is illustrative and not intended to be limiting. Other metadata and combinations of metadata can be used. In another example, metadata can include a format type that identifies a type of format. In one embodiment, two independent kinds of format metadata can be used: a source format identifying a data format of uploaded data, and a serving format identifying a data format for a track served to a video player.

In one example, server 160 retrieves only multiple tracks of timed text as described above. Video data itself associated with the multiple tracks of TT can be streamed separately by a different server or otherwise uploaded separately to a client device 130.

In another example, in addition to retrieving the multiple tracks of timed text in step 220, server 160 can also retrieve the associated requested video data and return the requested video data and multiple tracks of TT to video player 150 for viewing by the remote user.

In step 230, server 160 may process further requests for one or more selected tracks of TT. As mentioned above with respect to step 220, in cases where a track list was sent and an initial track of timed text was sent, a user may request a different track of timed text. These requests can be received over network 120 from client 130. Server 160 then retrieves the timed text for the requested track, and returns the requested TT to video player 150 for viewing by the remote user.

These examples are illustrative and not intended to necessarily limit the present invention. Different metadata and track lists may be used as would be apparent to a person skilled in the art given this description.

FIG. 3 is a flow diagram of a method 300 for playing videos having multiple tracks of timed text over a network according to an embodiment of the present invention (steps 310-340). In step 310, browser 140 may enable a user to select video data having multiple tracks of timed text. For instance, a user may direct browser 140 to a web site supported by server 160 that makes available video data. This web site can list or support search of video data available over network 120. A user can select a desired video through a user-interface at client 130 for play by video player 150. Video player 150 then sends a request to server 160 for the requested video. In other examples, step 310 may be carried out by video player 150 itself or the combination of browser 140 and video player 150.

In step 320, video player 150 plays the selected video and an initial track of timed text. For instance, client 130 may receive a video and at least one track of timed text to fulfill a video file request. Client 130 then stores the received video and any track metadata including at least one initial track of timed text. Video player 150 then plays the received video and at least one initial track of timed text for viewing by the user. In one example, video player 150 automatically determines an appropriate location for the timed text to be displayed relative to the video being played. This can be based on different parameters, if known, such as one or more of window size, aspect ratio of the video, user preference, or default value.
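
One possible way to determine a location for the timed text from these parameters is sketched below, for illustration only. The placement rule, the `"below"` and `"overlay"` mode names, the pixel units, and the 40-pixel text band are all assumptions; the description above does not prescribe a particular rule.

```python
from typing import Optional, Tuple

DEFAULT_MODE = "below"  # assumed default value; any other default could be used


def place_timed_text(
    window_w: int,
    window_h: int,
    aspect_ratio: float,
    preference: Optional[str] = None,
    text_h: int = 40,
) -> Tuple[str, int]:
    """Return (mode, y), where y is the top edge of the text band in pixels.

    The video is assumed to be fitted to the top of the window at full width
    while preserving its aspect ratio (width divided by height).
    """
    video_h = min(window_h, round(window_w / aspect_ratio))
    fits_below = window_h - video_h >= text_h
    # A user preference, if known, takes priority over the default value.
    mode = preference if preference in ("below", "overlay") else DEFAULT_MODE
    # If there is no room underneath the video, draw the text on top of it.
    if mode == "below" and not fits_below:
        mode = "overlay"
    y = video_h if mode == "below" else video_h - text_h
    return mode, y
```

With a 640 by 480 window and a 4:3 video there is no free band under the video, so this sketch falls back to drawing the text over the bottom edge of the picture.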

FIG. 5 shows an example window 510, timed text 512, and panel 520 that a video player 150 may provide within a window 500 generated by browser 140. Window 510 displays the video played in step 320. Timed text 512 is displayed at or near the video playing in window 510. Timed text 512 can be the TT for the initial track played in step 320 as described above. In one example, TT 512 can be displayed underneath window 510 as shown in FIG. 5. In another example, TT 512 can be displayed by video player 150 on top of (or embedded within) the video playing in window 510. Panel 520 may also be displayed to show metadata (such as track name and language for all multiple tracks) associated with a received track list. The initial track metadata associated with TT 512 may be highlighted by a check or other indication (see “German:demo-2” highlighted with a check in the example in FIG. 5). An optional panel 530 may be provided which includes further indicia and/or control elements. Example indicia may be information about the length of time of the video, video rating information, and number of times it has been played. Example control elements may be buttons to add a tag, download for a particular computing or operating system platform, display a playlist, flag as inappropriate, check sender “from user” information, view related comments, toggle continuous playback, or select other videos.

In step 330, video player 150 may further enable a user to select a track of timed text. For instance, a user may select to view panel 520 and select a different track in panel 520 than the highlighted track. For instance, a user may select the track named “Chinese:traditional” in panel 520. Video player 150 then sends a request for this track of TT to server 160. Alternatively, video player 150 may first check to see whether the requested track of TT has been previously loaded and stored at client 130.

In step 340, video player 150 plays the requested track of TT. For instance, client 130 may receive and store the requested track of TT. Video player 150 then retrieves the requested track of TT from memory in client 130 and displays the requested track of TT in place of any initial track of TT.
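
Steps 330 and 340, including the optional check for a previously loaded track, can be sketched as follows. This is an illustrative outline only: the `TrackCache` class, its method names, and the `fetch` callback (standing in for a request to server 160 over network 120) are hypothetical.

```python
from typing import Callable, Dict, Optional


class TrackCache:
    """Client-side store of timed text tracks already received at the client."""

    def __init__(self, fetch: Callable[[str], str]) -> None:
        self._fetch = fetch                # hypothetical call to the server
        self._stored: Dict[str, str] = {}  # track id -> timed text
        self.current_id: Optional[str] = None

    def store(self, track_id: str, timed_text: str) -> None:
        """Record a track that arrived earlier, e.g. the initial track."""
        self._stored[track_id] = timed_text

    def select(self, track_id: str) -> str:
        """Step 330: request the track only if it is not already stored.

        Step 340: the selected track replaces the track displayed before.
        """
        if track_id not in self._stored:
            self._stored[track_id] = self._fetch(track_id)
        self.current_id = track_id
        return self._stored[track_id]
```

Switching back to a track that was displayed earlier then requires no further request to the server.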

Example Process Flow

FIG. 4 shows an example process flow for providing and playing videos having multiple tracks of timed text over a network according to a further embodiment of the present invention. In particular, this process flow shows in further detail how server (S) 160 and video player 150 operate with one another in the above example involving a track list.

First, a user may select a video with multiple tracks of TT as described above with respect to step 310. Video player 150 may send a request for video data 402 to S 160. Video player 150 may also send a request for a timed text track list 404 to S 160. These requests 402, 404 can be separate or part of a single request.

As described above with respect to step 220, S 160 processes request 402 and sends the requested video data 406 to video player 150. S 160 processes request 404 and sends an initial track list 408 (i.e., a track list having the metadata identifying which tracks the video has) to video player 150.

If a user has selected closed captioning to be on, video player 150 may send a request for an initial track of TT 410 to S 160. S 160 then sends the requested timed text 412 for the initial track. Video player 150 plays the requested video and the initial track of TT as described above with respect to step 320. In an embodiment, S 160 need not serve video data requested in request 402 itself. Instead, the video may be streamed from a separate server (not shown). Such a separate server (or combination of servers) can be responsible for handling requests for video data and serving the video data to one or more client devices 130, and in particular to one or more video players 150.

A user may select a different track at video player 150 as described with respect to step 330. Video player 150 then sends a request for the selected track of TT 414 to S 160. S 160 then sends TT for the selected track 415 to video player 150. Video player 150 may then play the selected different track of TT in place of the initial track of TT.
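
The exchange of FIG. 4 can be outlined in code as a minimal, in-process simulation. The sketch below is illustrative only: the `Server` class, its method names, and the rule for choosing the initial track by preferred language are assumptions, and direct method calls stand in for the messages sent over network 120. Comments map each call to the reference numerals used above.

```python
from typing import Dict, List, Optional


class Server:
    """Hypothetical stand-in for S 160 backed by an in-memory track store."""

    def __init__(self, video: bytes, tracks: Dict[str, dict]) -> None:
        self._video = video
        self._tracks = tracks

    def video_data(self) -> bytes:
        return self._video  # request 402 -> video data 406

    def track_list(self) -> List[dict]:
        # request 404 -> track list 408: metadata only, no timed text
        return [
            {"id": tid, "name": t["name"], "language": t["language"]}
            for tid, t in self._tracks.items()
        ]

    def timed_text(self, track_id: str) -> str:
        return self._tracks[track_id]["timed_text"]  # 410 -> 412, 414 -> 415


def run_flow(
    server: Server,
    captions_on: bool,
    preferred_language: str,
    user_choice: Optional[str] = None,
) -> List[str]:
    """Return the timed text displayed by the player, in order."""
    shown: List[str] = []
    server.video_data()           # 402 / 406
    tracks = server.track_list()  # 404 / 408
    if captions_on and tracks:
        initial = next(
            (t for t in tracks if t["language"] == preferred_language), tracks[0]
        )
        shown.append(server.timed_text(initial["id"]))  # 410 / 412
    if user_choice is not None:
        shown.append(server.timed_text(user_choice))    # 414 / 415
    return shown
```

In a deployment where the video is streamed from a separate server, only `video_data` would move to that server; the track list and timed text calls would be unchanged.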

These examples are illustrative and not intended to necessarily limit the present invention. Different metadata and track lists may be used as would be apparent to a person skilled in the art given this description.

Editing Video Having Multiple Tracks of Timed Text Over a Network

According to a further feature, remote editing of online video having multiple tracks of timed text is provided. “Editing timed text” as used herein broadly refers to adding timed text, deleting timed text, and/or changing timed text.

As shown in FIG. 1B, in one embodiment, a system 100B for editing online video having multiple tracks of timed text includes a multi-track timed text editor 110 coupled to network 120. A user-interface (U/I) 190 can be coupled to multi-track timed text editor 110. Multi-track timed text editor 110 and U/I 190 can be part of any client device (not shown) capable of communicating over network 120 with server 160.

Multi-track timed text editor 110 communicates with server 160 to enable a user to edit timed text in multiple tracks of associated video files. One or more panels or other control elements may be provided to a user. In one example, a browser is provided as part of or coupled to editor 110. In this way, a user can access editor 110 through the browser to view and provide control inputs. U/I 190 can be any type of U/I that allows a user to interface with a browser and/or editor 110 to carry out editing of video having multiple tracks of TT over network 120. Operation of editor 110 is described further with respect to FIGS. 6-8B.

A method 600 for editing timed text in one or more tracks according to an embodiment is shown in FIG. 6. A user uploads video data (such as a video file or stream) (step 610). For example, a user may navigate with a browser to a web site supported by server 160. A control element (such as a panel) may be displayed to the user to allow the user to select a video file. S 160 then sends the selected video file and associated metadata to editor 110 over network 120. Associated metadata may include, for example, video length, video rating information, number of times the video has been played, or other information.

In step 620, a user applies a track name and language of an initial track of TT. The name can be any identifying name the user associates with the initial track. The language can be the language of the timed text. In the example of the system 100B, a user can input the name and language through U/I 190. Multi-track timed text editor 110 then stores the name and language as metadata associated with the initial track (step 620). For instance, editor 110 may create a track list that includes the metadata (name and language for the initial track). Other metadata (such as format) can be edited as well.

In step 630, a user may further edit any timed text for the initial track. This can include editing timed text corresponding to snippets (timed segments) of the video through multi-track timed text editor 110.

In an embodiment, if a user wishes to edit another track (step 640), then steps 620 and 630 may be repeated; otherwise the method ends (step 650).
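
The loop formed by steps 610 through 650 can be outlined as below. This is a simplified, non-limiting sketch: `EditedTrack`, `EditSession`, and `edit_session` are hypothetical names, user input through U/I 190 is represented by a list of plain dictionaries, and each timed text entry is assumed to be a (start, end, text) tuple.

```python
from dataclasses import dataclass, field
from typing import Iterable, List, Tuple

TimedEntry = Tuple[float, float, str]  # assumed shape: (start, end, text)


@dataclass
class EditedTrack:
    name: str
    language: str
    timed_text: List[TimedEntry] = field(default_factory=list)


@dataclass
class EditSession:
    video_id: str
    track_list: List[EditedTrack] = field(default_factory=list)


def edit_session(video_id: str, track_inputs: Iterable[dict]) -> EditSession:
    """Walk through method 600 for one video and any number of tracks."""
    session = EditSession(video_id)  # step 610: video data selected or uploaded
    for item in track_inputs:        # step 640: is there another track to edit?
        # Step 620: apply the track name and language as metadata.
        track = EditedTrack(item["name"], item["language"])
        # Step 630: edit (here, simply add) timed text for the track.
        track.timed_text.extend(item.get("timed_text", []))
        session.track_list.append(track)
    return session                   # step 650: no more tracks, method ends
```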

An example web-based implementation of a multi-track TT editor 110 according to an embodiment of the present invention is further described with respect to example windows 700 and 800 depicted in FIGS. 7A through 7C, 8A and 8B. These windows are illustrative and not intended to limit the present invention.

As shown in FIG. 7A, when a user wants to create a new caption track, he or she can open input window 700. In one example, window 700 may be presented by editor 110 through a browser to the user. Input window 700 may contain editing region 710. Editing region 710 may include, but is not limited to, control elements 712, 714, 716 and 718. Control elements 712, 714, 716 and 718 may include buttons, dropdown menus, links, or other U/I control elements known in the art.

Control element 712 may allow the user to select a language. For example, control element 712 may be a drop down listing 720 of languages as depicted in FIG. 7B. Control element 714 allows a user to input a name for the timed text track. If a user has a text file for the video, he or she can upload it by using control element 716. Alternatively, a user can edit timed text information in a window 722 of control element 718 as depicted in FIG. 7C. In one example, timed text may be entered as alternating lines corresponding to lines of text a user wishes to have appear during the video playback (see e.g., window 722).
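
For illustration, one way such alternating-line input might be parsed is sketched below. The exact layout is not specified above, so this sketch makes several assumptions: each entry is a time line followed by a text line, times are written as `SS`, `MM:SS`, or `HH:MM:SS`, blank lines are ignored, and each entry is displayed until the next entry starts. The function names are hypothetical.

```python
from typing import List, Optional, Tuple

Entry = Tuple[float, Optional[float], str]  # (start, end or None, text)


def parse_time(stamp: str) -> float:
    """Convert "SS", "MM:SS", or "HH:MM:SS" (fractions allowed) to seconds."""
    seconds = 0.0
    for part in stamp.strip().split(":"):
        seconds = seconds * 60 + float(part)
    return seconds


def parse_alternating(text: str) -> List[Entry]:
    """Parse alternating time and text lines into timed entries.

    The last entry has no following start time, so its end is left as None.
    """
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    if len(lines) % 2:
        raise ValueError("expected a time line followed by a text line per entry")
    starts = [parse_time(stamp) for stamp in lines[0::2]]
    texts = lines[1::2]
    ends: List[Optional[float]] = [*starts[1:], None]
    return list(zip(starts, ends, texts))
```

An uploaded text file (control element 716) in the same assumed layout could be passed through the same function.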

In a further example, editor 110 may further allow a user to manage collections of videos that have been uploaded for editing. FIG. 8A depicts an example window 800 that editor 110 may output. Window 800 contains a listing of a user's videos and the status thereof. A user may edit (add, edit, or delete) one or more of the multiple timed text tracks. Window 800 may contain action regions associated with respective video files. For example, action region 810 may include, but is not limited to, various control elements for the second video listed in window 800. Some control elements may include buttons, dropdown menus, and other control elements known in the art. FIG. 8B shows an example where a control element 812 is a drop down listing of the timed text tracks that are available for a given video.

In an embodiment, server 160 stores the timed text track edited by editor 110 in a portion of the video file on database 180. Alternatively, the timed text track may be stored in a separate file on database 180 and linked to the video. Once the timed text track is added to a video, a viewer of the video can then select to play the video with the submitted timed text track with a video player 150 as described above. S 160 can then send the video and timed text track for storage on client 130 for play by video player 150. S 160 can also stream video to the video player 150 for play.

Example Computer Implementation

Various aspects of embodiments of the present invention, including systems 100A, 100B and components therein, such as client 130, server 160, multi-track timed text editor 110, browser 140, and video player 150, can be implemented by software, firmware, hardware, or a combination thereof. Client 130, editor 110 and server 160 may each be implemented on any computing or processing device that supports network communication. Example computing or processing devices include, but are not limited to, a computer, workstation, distributed computing system, embedded system, stand-alone electronic device, networked device, mobile device, set-top box, television, or other type of processor or computer system. Further, the functionality of client 130, editor 110 and server 160 can be distributed across one or more computing or processing devices at the same or different locations.

Embodiments have been described above primarily with respect to web technology; however, the invention is not necessarily limited to the Web and can be used in other environments as would be apparent to a person skilled in the art given this description. For instance, video player 150 can be run without use of a browser 140 and server 160 may be run without use of a web server.

Conclusion

The present invention has been described above with the aid of functional building blocks illustrating the implementation of specified functions and relationships thereof. The boundaries of these functional building blocks have been arbitrarily defined herein for the convenience of the description. Alternate boundaries can be defined so long as the specified functions and relationships thereof are appropriately performed.

The foregoing description of the specific embodiments will so fully reveal the general nature of the invention that others can, by applying knowledge within the skill of the art, readily modify and/or adapt for various applications such specific embodiments, without undue experimentation, without departing from the general concept of the present invention. Therefore, such adaptations and modifications are intended to be within the meaning and range of equivalents of the disclosed embodiments, based on the teaching and guidance presented herein. It is to be understood that the phraseology or terminology herein is for the purpose of description and not of limitation, such that the terminology or phraseology of the present specification is to be interpreted by the skilled artisan in light of the teachings and guidance.

The breadth and scope of the present invention should not be limited by any of the above-described exemplary embodiments, but should be defined only in accordance with the following claims and their equivalents.

It is to be appreciated that the Detailed Description section, and not the Summary and Abstract sections, is intended to be used to interpret the claims. The Summary and Abstract sections may set forth one or more but not all exemplary embodiments of the present invention as contemplated by the inventor(s), and thus, are not intended to limit the present invention and the appended claims in any way.

1-8. (canceled)
 9. A method for editing a video having multiple tracks of timed text over a network, comprising: (a) uploading video data with multiple tracks of timed text over the network; (b) applying metadata for a track; and (c) editing timed text for the track.
 10. The method of claim 9, wherein the metadata includes at least one of a name of the track, the language of text data in the timed text of the track, and data format.
 11-22. (canceled)
 23. A client device for editing a video having multiple tracks of timed text over a network, comprising: a multi-track timed text editor that uploads video data with multiple tracks of timed text over the network, applies metadata for a track, and edits timed text for the track.
 24. The device of claim 23, wherein the metadata includes at least one of a name of the track, the language of text data in the timed text of the track, and data format.
 25. A system comprising: a memory; and a processing device, coupled to the memory, to: a server that processes requests for individual tracks of timed text selected by a remote user over the network; receive, via a network, an input from the remote user, the input specifying a video capable of being played by video players, the video associated with multiple tracks of timed text, receive a selection from the remote user, identifying a language, when the specified video is associated with at least one timed text track in the selected language, enable the remote user to edit timed text of the at least one timed text track via a multi-track timed text editor user interface; and when the specified video is not associated with the at least one timed text track in the selected language, enable the remote user to add a timed text track for the specified video in the selected language via the multi-track timed text editor user interface.
 26. (canceled)
 27. A method for providing video with multiple tracks of timed text to a video player in a client device over a network for display to a remote user, comprising: (a) receiving, from a video player, a first request for a video associated with multiple tracks of timed text; (b) in response to the first request in (a), providing a track list associated with the video to the video player, the track list identifying the multiple tracks of timed text associated with the video; (c) receiving a second request identifying an initial track of timed text associated with the video, the initial track selected by the video player based on at least one preference value; (d) retrieving timed text for the initial track; and (e) sending the timed text retrieved in (d) over the network to the video player for display with the video.
 28. The method of claim 27, further comprising: (f) receiving a third request for a particular track from the track list, the particular track being different from the initial track and selected by a user of the video player; (g) retrieving timed text associated with the particular track; and (h) sending the timed text retrieved in (g) over the network to the video player for display to the user with the video.
 29. The method of claim 28, wherein the particular track is in a language different from the initial track.
 30. A method for playing videos having multiple tracks of timed text provided from a video server over a network for display to a user, comprising: (a) receiving a first selection from a user, the first selection identifying a video on the video server to play, the video being associated with multiple tracks of timed text; (b) in response to receipt of the first selection in (a), sending, to the video server, a first request for the video; (c) in response to the first request, receiving a track list associated with the video; (d) selecting an initial track of timed text associated with the video from the track list based on at least one preference value; (e) sending a second request for the initial track of timed text; and (f) displaying timed text for the initial track with the video.
 31. The method of claim 30, further comprising: (g) receiving a second selection from the user, the second selection identifying a particular track from the track list, the particular track being different from the initial track; (h) sending a third request for the particular track; and (i) displaying timed text for the particular track with the video.
 32. The method of claim 31, wherein the particular track is in a language different from the initial track.
 33. A system for providing video with multiple tracks of timed text to a video player in a client device over a network for display to a remote user, comprising: a server that: receives, from a video player, a first request for a video associated with multiple tracks of timed text, in response to the first request, provides a track list associated with the video to the video player, the track list identifying the multiple tracks of timed text associated with the video, receives a second request identifying an initial track of timed text associated with the video, the initial track selected by the video player based on at least one preference value, retrieves timed text for the initial track; and sends the timed text for the initial track over the network to the video player for display with the video.
 34. The system of claim 33, wherein the server: receives a third request for a particular track from the track list, the particular track being different from the initial track and selected by a user of the video player, retrieves timed text associated with the particular track, and sends the timed text associated with the particular track over the network to the video player for display to the user with the video.
 35. The system of claim 34, wherein the particular track is in a language different from the initial track.
 36. A system for playing videos having multiple tracks of timed text provided from a video server over a network for display to a user, comprising: a video player that: receives a first selection from a user, the first selection identifying a video on the video server to play, the video being associated with multiple tracks of timed text, in response to receipt of the first selection, sends, to the video server, a first request for the video, in response to the first request, receives a track list associated with the video, selects an initial track of timed text associated with the video from the track list based on at least one preference value, sends a second request for the initial track of timed text, and displays timed text for the initial track with the video.
 37. The system of claim 36, wherein the video player: receives a second selection from the user, the second selection identifying a particular track from the track list, the particular track being different from the initial track; sends a third request for the particular track; and displays timed text for the particular track with the video.
 38. The system of claim 37, wherein the particular track is in a language different from the initial track.
 39. The system of claim 36, further comprising: a browser coupled to the video player.
 40. The system of claim 25, wherein the processor is further to upload video data with the multiple tracks of timed text over the network, and apply metadata for the multiple tracks.
 41. The system of claim 25, wherein the processor is further to receive an input from the remote user indicating a selection of multiple videos to delete, and delete the selected videos and timed text tracks associated with the selected videos in response to the input.
 42. The system of claim 25, wherein the processor is further to enable the remote user to edit metadata describing the at least one timed text track of the selected language.
 43. The system of claim 25, wherein the processor is further to create and store a video file capable of displaying text content of the at least one timed text track at times specified by the at least one timed text track.
 44. The system of claim 25, wherein the processor is further to create a video file that includes both the specified video and an edited timed text of the at least one timed text track for the selected language.
 45. The system of claim 25, wherein the processor is further to store an edited timed text of the at least one timed text track in a separate file linked to the video file.
 46. A computer-implemented method, comprising: receiving, via a network, an input from a remote user, specifying a video associated with multiple tracks of timed text; receiving, from the remote user, a selection identifying a language; when the specified video is associated with at least one timed text track in the selected language, enabling, by a multi-track timed text editor user interface, the remote user to edit timed text of the at least one timed text track; and when the specified video is not associated with the at least one timed text track in the selected language, enabling, by the multi-track timed text editor user interface, the remote user to add a timed text track for the specified video in the selected language.
 47. The method of claim 46, further comprising: enabling the remote user to upload video data with the multiple tracks of timed text over the network; and enabling the remote user to apply metadata for the multiple tracks of timed text.
 48. The method of claim 46, further comprising: receiving, by the multi-track timed text editor user interface, an input from the remote user indicating a selection of multiple videos to delete; and deleting the selected videos and timed text tracks associated with the selected videos in response to the input.
 49. The method of claim 46, further comprising: enabling, by the multi-track timed text editor user interface, the remote user to edit metadata describing the at least one timed text track of the selected language.
 50. The method of claim 46, further comprising creating and storing a video file capable of displaying text content of the at least one timed text track at times specified by the at least one timed text track.
 51. The method of claim 46, further comprising creating a video file that includes both the specified video and an edited timed text of the at least one timed text track for the selected language.
 52. The method of claim 46, further comprising storing an edited timed text of the at least one timed text track in a separate file linked to the video file.
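By way of illustration only, the two conditional branches recited in claim 46 might be sketched as follows. The sketch assumes, hypothetically, that the tracks of a video are held in a dictionary keyed by language and that each track is a list of cues; neither the data layout nor the name `open_track_for_language` comes from the claims.

```python
from typing import Dict, List, Tuple

# Hypothetical cue shape: (start_seconds, end_seconds, text).
Cue = Tuple[float, float, str]


def open_track_for_language(
    tracks_by_language: Dict[str, List[Cue]],
    language: str,
) -> Tuple[str, List[Cue]]:
    """Decide whether the editor edits an existing track or adds a new one.

    Returns a mode string ("edit" or "add") and the cue list the editor
    should operate on.
    """
    if language in tracks_by_language:
        # The video already has a track in the selected language: edit it.
        return "edit", tracks_by_language[language]
    # No track in the selected language yet: add an empty one (assumed behavior).
    tracks_by_language[language] = []
    return "add", tracks_by_language[language]
```

A call with a language that already has a track returns the existing cues for editing; a call with a new language registers an empty track that the user can then populate.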
 53. A non-transitory computer-readable storage medium including data that, when accessed by a processor, causes the processor to perform operations comprising: receiving, via a network, an input from a remote user, specifying a video associated with multiple tracks of timed text; receiving a selection from the remote user, identifying a language; enabling the remote user to edit timed text of at least one timed text track in the selected language when the specified video is associated with the at least one timed text track; and enabling the remote user to add a timed text track for the specified video in the selected language when the specified video is not associated with the at least one timed text track in the selected language.
 54. The computer-readable storage medium of claim 53, the operations further comprising: enabling the remote user to upload video data with the multiple tracks of timed text over the network; and enabling the remote user to apply metadata for the multiple tracks of timed text.
 55. The computer-readable storage medium of claim 53, the operations further comprising: receiving an input from the remote user indicating a selection of multiple videos to delete; and deleting the selected videos and timed text tracks associated with the videos in response to the input.
 56. The computer-readable storage medium of claim 53, the operations further comprising modifying metadata describing the at least one timed text track of the selected language.
 57. The computer-readable storage medium of claim 53, the operations further comprising creating and storing a video file capable of displaying text content of the at least one timed text track at times specified by the at least one timed text track.
 58. The computer-readable storage medium of claim 53, the operations further comprising creating a video file that includes both the specified video and an edited timed text of the at least one timed text track for the selected language.
 59. The computer-readable storage medium of claim 53, the operations further comprising storing an edited timed text of the at least one timed text track in a separate file linked to the video file.